{
 "metadata": {
  "name": "",
  "signature": "sha256:5b1659e5ad6c9793528a984613d0c142e6f21398cc7406c9bcaee0fed91bff8e"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "heading",
     "level": 1,
     "metadata": {},
     "source": [
      "Getting started with PlanOut"
     ]
    },
    {
     "cell_type": "heading",
     "level": 3,
     "metadata": {},
     "source": [
      "Your first experiment"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "This imports basic operators for doing random assignment and SimpleExperiment, the base class for logging"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "from planout.ops.random import *\n",
      "from planout.experiment import SimpleExperiment\n",
      "import pandas as pd"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "You can define new experiments by subclassing `SimpleExperiment`, and implementing an `assign()` method."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class SignupExperiment(SimpleExperiment):\n",
      "  def assign(self, params, cookieid):\n",
      "    params.button_color = UniformChoice(\n",
      "      choices=[\"#ff0000\", \"#00ff00\"],\n",
      "      unit=cookieid)\n",
      "    params.button_text = UniformChoice(\n",
      "      choices=[\"Join now\", \"Sign me up!\"],\n",
      "      unit=cookieid) "
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "You can get randomized assignments for your input units by creating instances of the class. The code below gets the parameter values, `button_text` and `button_color` for `cookeid` = 4."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "e = SignupExperiment(cookieid=4)\n",
      "print e.get('button_text')\n",
      "print e.get('button_color')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Here are the assignments for 10 userids."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "for i in xrange(10):\n",
      "    e = SignupExperiment(cookieid=i)\n",
      "    print \"cookie = %s: %s, %s\" % (i, e.get('button_text'), e.get('button_color'))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "To check to see that the experiment is doing what we expect it to, we can simulate assignments for many userids and construct a dataframe with all of the assignments:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "sim_users = [SignupExperiment(cookieid=i).get_params() for i in xrange(1000)]\n",
      "assignments = pd.DataFrame.from_dict(sim_users)\n",
      "print assignments[:5]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "print assignments.groupby(['button_text', 'button_color']).size()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "heading",
     "level": 3,
     "metadata": {},
     "source": [
      "Unequal probability assignment with `WeightedChoice`"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The `WeightedChoice` operator lets you choose among multiple choices with different frequencies. The `weights` parameter is any set of weights (integer or floating point) to select among `choices`."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class SignupExperiment2(SimpleExperiment):\n",
      "  def assign(self, params, cookieid):\n",
      "    params.button_color = UniformChoice(\n",
      "      choices=[\"#ff0000\", \"#00ff00\"],\n",
      "      unit=cookieid)\n",
      "    params.button_text = WeightedChoice(\n",
      "      choices=[\"Join now!\", \"Sign me up!\"],\n",
      "      weights=[8, 2],\n",
      "      unit=cookieid)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The button text frequencies reflect these weights, while the button color continues to be split in equal proportions."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "sim_users = [SignupExperiment2(cookieid=i).get_params() for i in xrange(2000)]\n",
      "assignments = pd.DataFrame.from_dict(sim_users)\n",
      "print assignments.groupby(['button_text', 'button_color']).size()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "heading",
     "level": 3,
     "metadata": {},
     "source": [
      "41 shades of blue with `RandomInteger`"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Google is infamous for testing [41 different shades of blue](http://www.nytimes.com/2009/03/01/business/01marissa.html) for their link colors. Let's implement that experiment."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class ColorExperiment(SimpleExperiment):\n",
      "  def assign(self, params, userid):\n",
      "    params.blue_value = RandomInteger(min=215, max=255, unit=userid)\n",
      "    params.button_color = '#0000%s' % format(params.blue_value, '02x')\n",
      "    params.button_text = 'Join now!'"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ColorExperiment(userid=10).get_params()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "sim_users = [ColorExperiment(userid=i).get_params() for i in xrange(20000)]\n",
      "assignments = pd.DataFrame.from_dict(sim_users)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "assignments[:5]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "assignments['blue_value'].hist(bins=41);"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "heading",
     "level": 3,
     "metadata": {},
     "source": [
      "Using multiple input units to implement within-subjects designs"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "In some cases you might want to assign user-item pairs or user-session pairs to parameters. You can do this by simply passing more units into `assign()` and applying multiple units."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class SearchRankingExperiment(SimpleExperiment):\n",
      "  def assign(self, params, userid, sessionid):\n",
      "    params.ranking_model = UniformChoice(choices=['v0','v212'], unit=[userid, sessionid])\n",
      "\n",
      "print SearchRankingExperiment(userid=8, sessionid=1).get('ranking_model')\n",
      "print SearchRankingExperiment(userid=8, sessionid=3).get('ranking_model')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "heading",
     "level": 3,
     "metadata": {},
     "source": [
      "Chaining together random assignments"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Let's consider a case where you had multiple ranking models you wanted to compare against the control, and you were interested in doing a within-subjects design. You could first assign users to candidate models, and then assign users to either see the control or candidate model."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class SearchRankingExperiment(SimpleExperiment):\n",
      "  def assign(self, params, userid, sessionid):\n",
      "    params.candidate_model = UniformChoice(choices=['v212', 'v213', 'v214'], unit=userid)\n",
      "    params.ranking_model = UniformChoice(choices=['v0', params.candidate_model], unit=[userid, sessionid])\n",
      "\n",
      "for s in xrange(5):\n",
      "    print SearchRankingExperiment(userid=3, sessionid=s).get('ranking_model')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "heading",
     "level": 1,
     "metadata": {},
     "source": [
      "How random assignment works in PlanOut"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "PlanOut hashes input data to provide developers with randomized parameter assignments. Unless otherwise specified, all random assignments are independent. It combines experiment-level and parameter-level salts with the given input units in a way that ensures that:\n",
      "  * The same units (e.g., user ids) get mapped to different values for different experiments or parameters.\n",
      "  * Assignments are as good as random.\n",
      "\n",
      "Underneath the hood, PlanOut computes a hash that looks like\n",
      "\n",
      "    f(SHA1(experiment_name.parameter_name.unit_id))\n",
      "\n",
      "So for example, in the experiment below, PlanOut computes something that looks like:\n",
      "\n",
      "    SHA1(RandomExample1.x.4) % 2\n",
      "  \n",
      "to select the value for `x` when the given `userid` is 4."
     ]
    },
    {
     "cell_type": "heading",
     "level": 3,
     "metadata": {},
     "source": [
      "Parameter-level salts"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class RandomExample1(SimpleExperiment):\n",
      "  def assign(self, params, userid):\n",
      "    params.x = UniformChoice(choices=[0, 1], unit=userid)\n",
      "    params.y = UniformChoice(choices=['a','b'], unit=userid)\n",
      "    \n",
      "sim_users = [RandomExample1(userid=i).get_params() for i in xrange(2000)]\n",
      "assignments = pd.DataFrame.from_dict(sim_users)\n",
      "print assignments.groupby(['x', 'y']).size()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "PlanOut automatically \"salts\" each random assignment operator with the name of the parameter you are assigning. By writing `params.foo = Bar(...)`, you are implicitly passing the salt, \"foo\", into `Bar()`. The following experiment is equivalent to the code above."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class RandomExample1(SimpleExperiment):\n",
      "  def assign(self, params, userid):\n",
      "    params.x = UniformChoice(choices=[0, 1], unit=userid, salt='x')\n",
      "    params.y = UniformChoice(choices=['a','b'], unit=userid, salt='y')\n",
      "    \n",
      "sim_users = [RandomExample1(userid=i).get_params() for i in xrange(2000)]\n",
      "assignments = pd.DataFrame.from_dict(sim_users)\n",
      "print assignments.groupby(['x', 'y']).size()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Comparing the cross tabs from the first and second experiments, it's clear that the two experiments produce identical assignments."
     ]
    },
    {
     "cell_type": "heading",
     "level": 4,
     "metadata": {},
     "source": [
      "Changing the salts change the assignments:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class RandomExample1(SimpleExperiment):\n",
      "  def assign(self, params, userid):\n",
      "    params.x = UniformChoice(choices=[0, 1], unit=userid, salt='x2')\n",
      "    params.y = UniformChoice(choices=['a','b'], unit=userid, salt='y2')\n",
      "    \n",
      "sim_users = [RandomExample1(userid=i).get_params() for i in xrange(2000)]\n",
      "assignments = pd.DataFrame.from_dict(sim_users)\n",
      "print assignments.groupby(['x', 'y']).size()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "heading",
     "level": 4,
     "metadata": {},
     "source": [
      "Parameters with the same salt will have correlated assignments. If you use the same salt for the exact same kind of random operation, your assignments will be perfectly correlated."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class RandomExample1(SimpleExperiment):\n",
      "  def assign(self, params, userid):\n",
      "    params.x = UniformChoice(choices=[0, 1], unit=userid, salt='x')\n",
      "    params.y = UniformChoice(choices=['a','b'], unit=userid, salt='x')\n",
      "    \n",
      "sim_users = [RandomExample1(userid=i).get_params() for i in xrange(2000)]\n",
      "assignments = pd.DataFrame.from_dict(sim_users)\n",
      "print assignments.groupby(['x', 'y']).size()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "heading",
     "level": 3,
     "metadata": {},
     "source": [
      "Experiment-level salts"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Each experiment also has its own salt. This makes it so that parameters with the same name will have independent random assignments, and also allows you to synchronize assignments across experiments in special situations."
     ]
    },
    {
     "cell_type": "heading",
     "level": 4,
     "metadata": {},
     "source": [
      "By default, experiment class names are used as experiment-level salts"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "This keeps parameter assignments for parameters with the same name independent of one another."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class RandomExample1(SimpleExperiment):\n",
      "  def assign(self, params, userid):\n",
      "    params.x = UniformChoice(choices=[0, 1], unit=userid)\n",
      "    params.y = UniformChoice(choices=['a','b'], unit=userid)\n",
      "    \n",
      "class RandomExample2(SimpleExperiment):\n",
      "  def assign(self, params, userid):\n",
      "    params.x = UniformChoice(choices=[4, 8], unit=userid)\n",
      "    params.y = UniformChoice(choices=['m','n'], unit=userid)\n",
      "\n",
      "sim_users = [RandomExample1(userid=i).get_params() for i in xrange(4000)]\n",
      "assignments = pd.DataFrame.from_dict(sim_users)\n",
      "print assignments.groupby(['x', 'y']).size()\n",
      "\n",
      "sim_users = [RandomExample2(userid=i).get_params() for i in xrange(4000)]\n",
      "assignments = pd.DataFrame.from_dict(sim_users)\n",
      "print assignments.groupby(['x', 'y']).size()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "heading",
     "level": 4,
     "metadata": {},
     "source": [
      "Experiment-level salts can be specified by setting `self.salt`"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "The `self.salt` attribute of an experiment object specifies the experiment-level salt. You can set this attribute in the `setup()` method, which gets called before any assignments take place."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class RandomExample1(SimpleExperiment):\n",
      "  def setup(self):\n",
      "        self.salt = 'RandomExample2'\n",
      "\n",
      "  def assign(self, params, userid):\n",
      "    params.x = UniformChoice(choices=[0, 1], unit=userid)\n",
      "    params.y = UniformChoice(choices=['a','b'], unit=userid)\n",
      "\n",
      "sim_users = [RandomExample2(userid=i).get_params() for i in xrange(4000)]\n",
      "assignments = pd.DataFrame.from_dict(sim_users)\n",
      "print assignments.groupby(['x', 'y']).agg(len)\n"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "heading",
     "level": 3,
     "metadata": {},
     "source": [
      "Additional notes on random assignment"
     ]
    },
    {
     "cell_type": "heading",
     "level": 4,
     "metadata": {},
     "source": [
      "Random assignment with multiple units"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "When multiple units are used (e.g., in the case when user-item pairs are assigned to parameters in a within-subjects design), units are concatinated, so that if the input units are `userid=4` and `url='http://news.ycombinator.com'`, the hash operation would look like:\n",
      "\n",
      "    f(SHA1('RandomExperiment1.show_thumbnail.6.http://news.ycombinator.com'))"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class RandomExample1(SimpleExperiment):\n",
      "  def assign(self, params, userid, url):\n",
      "    params.show_thumbnail = BernoulliTrial(p=0.15, unit=[userid, url])\n",
      "\n",
      "RandomExample1(userid=6, url='http://news.ycombinator.com').get('show_thumbnail')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Note that since the names of units are not used, assignment does depend on the order of the units:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "class RandomExample1(SimpleExperiment):\n",
      "  def assign(self, params, userid, url):\n",
      "    params.show_thumbnail = BernoulliTrial(p=0.15, unit=[url, userid])\n",
      "\n",
      "RandomExample1(userid=6, url='http://news.ycombinator.com').get('show_thumbnail')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": []
    },
    {
     "cell_type": "heading",
     "level": 4,
     "metadata": {},
     "source": [
      "Namespaces"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "When an experiment is running under a namespace, the namespace name is concatenated with the experiment-level salt. See the namespace tutorial for more details."
     ]
    }
   ],
   "metadata": {}
  }
 ]
}